High-resolution 
quantization and entropy coding 
for fractional Brownian motion 



by 

S. Dereich and M. Scheutzow 

Technische Universität Berlin

Summary. We derive a high-resolution formula for the quantization and entropy coding
approximation quantities for fractional Brownian motion, with respect to the supremum
norm and $L^p[0,1]$-norm distortions. We show that all moments in the quantization
problem lead to the same asymptotics. Using a general principle, we conclude
that entropy coding and quantization coincide asymptotically. Under supremum-norm
distortion, our proof uses an explicit construction of efficient codebooks based
on a particular entropy constrained coding scheme. This procedure can be used to
construct close to optimal high resolution quantizers.

Keywords. High-resolution quantization; complexity; stochastic process; entropy; 
distortion rate function. 

2000 Mathematics Subject Classification. 60G35, 41A25, 94A29. 

1 Introduction 

Functional quantization and entropy coding concern the finding of "good" discrete approx- 
imations to a non-discrete random signal in a Banach space of functions. Such discrete 
approximations may serve as evaluation points for quasi Monte Carlo methods or as an 
information reduction of the original to allow storage on a computer or transmission over 
some channel with finite capacity. In recent years, research in this field has been very
active and has produced numerous new results. Previous research addressed, for instance, the
problem of constructing good approximation schemes, the evaluation of the theoretically
best approximation under an information constraint, the existence of optimal approximation
schemes and regularity properties of the paths of optimal approximations. The above
questions are treated for Gaussian measures in Hilbert spaces by Luschgy and Pagès ([11]-[12])
and by the first-named author in [3]. For Gaussian originals in Banach spaces, these
problems have been addressed by the authors and collaborators in 0, 0|, 0], 0| and by 
Graf, Luschgy and Pages in 0]. For general accounts of quantization and coding theory 
in finite dimensional spaces, see 0] and 0] (see also 1C|). 

In this article, we consider the asymptotic coding problem of fractional Brownian motion
for the supremum and $L^p[0,1]$-norm distortions. We derive the asymptotic quality of
optimal approximations. In particular, it is shown that efficient entropy constrained
quantizers can be used to construct close to optimal quantizers when considering the supremum
norm. Moreover, for each of the above norm-based distortions, all moments and both
information constraints lead to the same asymptotic approximation quality. In particular,
quantization is asymptotically just as efficient as entropy coding. The main impetus to
the present work was provided by the necessity to understand the coding complexity of
Brownian motion in order to solve the quantization (resp. entropy constrained coding)
problem for diffusions (see 0]).

Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space, let $H \in (0,1)$ and let
$X = (X_t)_{t \ge 0}$ denote fractional Brownian motion with Hurst index $H$ on
$(\Omega, \mathcal{A}, \mathbb{P})$, i.e. $(X_t)_{t \ge 0}$ is a centered continuous
Gaussian process with covariance kernel

\[ K(t,s) = \frac{1}{2}\left[ t^{2H} + s^{2H} - |t-s|^{2H} \right], \qquad t, s \ge 0. \]
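
For readers who wish to experiment numerically, the covariance kernel above can be turned into a simple sampler. The following is a minimal sketch and not part of the original article: it assumes NumPy, a finite grid of strictly positive time points and a Cholesky factorization of the covariance matrix. The names `fbm_covariance` and `sample_fbm` and the small diagonal jitter are illustrative choices.

```python
import numpy as np

def fbm_covariance(times, hurst):
    """Covariance matrix K(t, s) = (t^{2H} + s^{2H} - |t - s|^{2H}) / 2 on a time grid."""
    t = np.asarray(times, dtype=float)
    tt, ss = np.meshgrid(t, t, indexing="ij")
    two_h = 2.0 * hurst
    return 0.5 * (tt**two_h + ss**two_h - np.abs(tt - ss)**two_h)

def sample_fbm(times, hurst, rng, jitter=1e-12):
    """Draw one centered Gaussian vector with the fBm covariance (assumes times > 0)."""
    cov = fbm_covariance(times, hurst)
    # The jitter is a numerical safeguard assumed by this sketch, not part of the model.
    chol = np.linalg.cholesky(cov + jitter * np.eye(len(cov)))
    return chol @ rng.standard_normal(len(cov))

grid = np.linspace(1 / 64, 1.0, 64)
path = sample_fbm(grid, hurst=0.7, rng=np.random.default_rng(0))
```

For $H = 1/2$ the kernel reduces to $\min(t,s)$, the covariance of standard Brownian motion, which gives a quick sanity check of the implementation.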

We need some more notation. In the sequel, $C[0,a]$, $a > 0$, and $\mathbb{D}[0,a]$ denote the
space of continuous real-valued functions on the interval $[0,a]$ and the space of càdlàg
functions on $[0,a]$, respectively. Both spaces are endowed with the supremum norm
$\|\cdot\|_{[0,a]}$. Moreover, we let $(L^p[0,a], \|\cdot\|_{L^p[0,a]})$ denote the standard
$L^p$-space of real-valued functions defined on $[0,a]$. Finally, $\|\cdot\|_q$,
$q \in (0,\infty]$, denotes the $L^q$-norm induced by the probability measure
$\mathbb{P}$ on the set of real-valued random variables.

Let us briefly introduce the main objectives of quantization and entropy coding. Let
$E$ and $\hat E$ denote measurable spaces, and let $d : E \times \hat E \to [0,\infty)$ be a
product measurable function. For a given $E$-valued r.v. $Y$ (original) and moment $q > 0$,
the aim is to minimize

\[ \| d(Y, \pi(Y)) \|_q \tag{1} \]

over all measurable functions $\pi : E \to \hat E$ with discrete image (strategy) that satisfy a
particular information constraint parameterized by the rate $r \ge 0$.

Entropy coding (also known as entropy constrained quantization in the literature) concerns
the minimization of (1) over all strategies $\pi$ having entropy $\mathbb{H}(\pi(Y))$ at most $r$.
Recall that the entropy of a discrete r.v. $Z$ with probability weights $(p_w)$ is defined as

\[ \mathbb{H}(Z) = -\sum_{w} p_w \log p_w. \]

In the quantization problem, one is considering strategies $\pi$ satisfying the range
constraint $|\mathrm{range}(\pi(Y))| \le e^r$. The corresponding approximation quantities are
the entropy-constrained quantization error

\[ D^{(e)}(r|Y,E,\hat E,d,q) := \inf_{\pi} \| d(Y,\pi(Y)) \|_q, \tag{2} \]

where the infimum is taken over all strategies $\pi$ with entropy rate $r \ge 0$, and the
quantization error

\[ D^{(q)}(r|Y,E,\hat E,d,q) := \inf_{\pi} \| d(Y,\pi(Y)) \|_q, \tag{3} \]

the infimum being taken over all strategies $\pi$ having quantization rate $r \ge 0$. Often, all
or some of the parameters $Y, E, \hat E, d, q$ are clear from the context. Then we omit these
parameters in the quantities $D^{(e)}$ and $D^{(q)}$. The quantization information constraint is
more restrictive, so that the quantization error always dominates the entropy coding error.
Moreover, the coding error increases with the moment under consideration.
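
The ordering of the two constraints can be checked on a toy example: a discrete random variable taking at most $e^r$ values has entropy at most $r$ (in nats), so every strategy admissible for quantization at rate $r$ is also admissible for entropy coding at rate $r$. The following small sketch illustrates this; the function names are illustrative and do not come from the article.

```python
import math

def entropy_nats(weights):
    """Shannon entropy -sum_w p_w log p_w (natural logarithm) of a probability vector."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)

def log_range_size(weights):
    """Logarithm of the number of values that carry positive weight."""
    return math.log(sum(1 for p in weights if p > 0.0))

weights = [0.7, 0.2, 0.1]
rate_entropy = entropy_nats(weights)   # rate needed under the entropy constraint
rate_range = log_range_size(weights)   # rate needed under the range constraint
```

Equality between the two rates holds only for uniform weights, which is why the entropy constraint is in general the weaker one.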

Unless otherwise stated, we choose as original $Y = X$ and as original space $E =
C[0,\infty)$. We are mainly concerned with two particular choices for $\hat E$ and $d$. In the
first sections, we treat the case where $\hat E = \mathbb{D}[0,1]$ and
$d(f,g) = \|f - g\|_{[0,1]}$. In this setting we find:

Theorem 1.1. There exists a constant $\kappa = \kappa(H) \in (0,\infty)$ such that for all
$q_1 \in (0,\infty]$ and $q_2 \in (0,\infty)$,

\[ \lim_{r\to\infty} r^H D^{(e)}(r|q_1) = \lim_{r\to\infty} r^H D^{(q)}(r|q_2) = \kappa. \]

Remark 1.2. In the above theorem, general càdlàg functions are allowed as reconstructions.
Since the original process is continuous, it might seem more natural to use continuous
functions as approximations. The following argument shows that, for a finite moment
$q > 0$, the space $\hat E = \mathbb{D}[0,1]$ can be replaced by $\hat E = C[0,1]$ without
changing $D^{(q)}$ and $D^{(e)}$. Let $\pi : C[0,1] \to \mathbb{D}[0,1]$ be an arbitrary strategy and
let $\tau_n : \mathbb{D}[0,1] \to C[0,1]$ denote the linear operator mapping $f$ to its piecewise
linear interpolation with supporting points $0, \frac1n, \frac2n, \dots, 1$. Then

\[ \big\| \|X - \tau_n \circ \pi(X)\|_{[0,1]} \big\|_q
   \le \big\| \|\tau_n(X) - \tau_n \circ \pi(X)\|_{[0,1]} \big\|_q + \big\| \|X - \tau_n(X)\|_{[0,1]} \big\|_q
   \le \big\| \|X - \pi(X)\|_{[0,1]} \big\|_q + \big\| \|X - \tau_n(X)\|_{[0,1]} \big\|_q, \]

where the second inequality holds because $\tau_n$ is linear and does not increase the supremum
norm. Note that the second term vanishes when $n$ tends to infinity and that $\tau_n \circ \pi$
satisfies the same information constraint as $\pi$.

In the last section we conclude the article with a discussion of the case where $\hat E =
L^p[0,1]$ and $d(f,g) = \|f - g\|_{L^p[0,1]}$ for some $p \ge 1$. In this case, one has the
following analog to Theorem 1.1.

Theorem 1.3. For every $p \ge 1$ there exists a constant $\kappa = \kappa(H,p) \in (0,\infty)$
such that for all $q \in (0,\infty)$,

\[ \lim_{r\to\infty} r^H D^{(e)}(r|q) = \lim_{r\to\infty} r^H D^{(q)}(r|q) = \kappa. \]

Remark 1.4. It is again possible to replace the space $\hat E = L^p[0,1]$ by $\hat E = C[0,1]$
without changing $D^{(q)}$ and $D^{(e)}$. Indeed, for $\varepsilon > 0$, let
$h_\varepsilon : \mathbb{R} \to [0,\infty)$ denote a smooth function supported on
$[-\varepsilon, \varepsilon]$ with $\int h_\varepsilon = 1$, and define
$\tau_\varepsilon : L^p[0,1] \to C[0,1]$ through
$\tau_\varepsilon(f)(t) = \int_0^1 f(s)\, h_\varepsilon(t-s)\, ds$. Then for a given strategy
$\pi : C[0,1] \to L^p[0,1]$ one obtains

\[ \big\| \|X - \tau_\varepsilon \circ \pi(X)\|_{L^p[0,1]} \big\|_q
   \le \big\| \|\tau_\varepsilon(X) - \tau_\varepsilon \circ \pi(X)\|_{L^p[0,1]} \big\|_q + \big\| \|X - \tau_\varepsilon(X)\|_{L^p[0,1]} \big\|_q
   \le \big\| \|X - \pi(X)\|_{L^p[0,1]} \big\|_q + \big\| \|X - \tau_\varepsilon(X)\|_{L^p[0,1]} \big\|_q, \]

where the last inequality is a consequence of Young's inequality. Now for
$\varepsilon \downarrow 0$ the second term converges to 0.

For ease of notation, the article is restricted to the analysis of 1-dimensional processes.
However, when replacing $(X_t)$ by a process $(X_t^{(1)}, \dots, X_t^{(d)})$ consisting of $d$
independent fractional Brownian motions, the proofs can be easily adapted, and one obtains
analogous results. In particular, it is possible to prove analogs of the above theorems for a
multi-dimensional Brownian motion.

Let us summarize some of the known estimates for the constant k in the case where 
X is standard Brownian motion, i.e. H = 1/2. 

• When $\hat E = \mathbb{D}[0,1]$ and $d(f,g) = \|f - g\|_{[0,1]}$, the relationship between the
small ball function and the quantization problem (see 0) leads to

r n i 

• For $\hat E = L^p[0,1]$, $p \ge 1$, and $d(f,g) = \|f - g\|_{L^p[0,1]}$, $\kappa$ may again be
estimated via a connection to the small ball function. Indeed, letting

\[ \lambda_1 = \inf\Big\{ \int_{\mathbb{R}} |x|^p \varphi^2(x)\, dx + \frac12 \int_{\mathbb{R}} (\varphi'(x))^2\, dx \Big\}, \]

where the infimum is taken over all weakly differentiable $\varphi \in L^2(\mathbb{R})$ with unit
norm, one has

\[ \kappa \in [c, \sqrt{8}\, c]. \]

In the case where $p = 2$, the constant $\kappa$ is known explicitly: $\kappa = \frac{\sqrt2}{\pi}$
(see Q| and !).

The article is outlined as follows. In Sections 2 to 5 we consider the approximation
problems under the supremum norm. We start in Section 2 by introducing a coding
scheme which plays an important role in the sequel. In Section 3 we use the construction
of Section 2 and the self similarity of $X$ to establish a polynomial decay for
$D^{(e)}(\cdot|\infty)$. In the following section, the asymptotics of the quantization error
are computed. The proof relies on a concentration property for the entropies of "good"
coding schemes (Proposition 4.4). In Section 5 we use the equivalence of moments in the
quantization problem to establish a lower bound for the entropy coding problem. In the last
section, we treat the case where the distortion is based on the $L^p[0,1]$-norm, i.e.
$d(f,g) = \|f - g\|_{L^p[0,1]}$; we introduce the distortion rate function and prove
Theorem 1.3 with the help of Shannon's source coding theorem.

It is convenient to use the symbols $\sim$, $\lesssim$ and $\approx$. We write $f \sim g$ iff
$\lim \frac{f}{g} = 1$, while $f \lesssim g$ stands for $\limsup \frac{f}{g} \le 1$. Finally,
$f \approx g$ means

\[ 0 < \liminf \frac{f}{g} \le \limsup \frac{f}{g} < \infty. \]

2 The coding scheme 

This section is devoted to the construction of strategies
$\pi^{(n)} : C[0,n] \to \mathbb{D}[0,n]$ which we will need later in our discussion. The
construction depends on three parameters: $M \in \mathbb{N} \setminus \{1\}$, $d > 0$ and a
strategy $\pi : C[0,1] \to \mathbb{D}[0,1]$.

We define the maps by induction. Let $w \in C[0,\infty)$ and set
$(w^{(n)}_t)_{t \in [0,1]} := (w_{t+n} - w_n)_{t \in [0,1]}$ and
$\hat w_t := \pi(w^{(0)})(t)$ for $t \in [0,1)$. Assume that $(\hat w_t)_{t \in [0,n)}$
($n \in \mathbb{N}$) has already been defined. Then we choose $\xi_n$ to be the smallest
number in $\{-d + 2kd/(M-1) : k = 0, \dots, M-1\}$ minimizing

\[ |w_n - (\hat w_{n-} + \xi_n)|, \]

and extend the definition of $\hat w$ on $[n, n+1)$ by setting

\[ \hat w_{n+t} := \hat w_{n-} + \xi_n + \pi(w^{(n)})(t), \qquad t \in [0,1). \]

Note that $(\hat w_t)_{t \in [0,n)}$ depends only upon $(w_t)_{t \in [0,n)}$, so that the above
construction induces strategies

\[ \pi^{(n)} : C[0,n] \to \mathbb{D}[0,n], \qquad w \mapsto (\hat w^{(n)}_t)_{t \in [0,n]}, \]

where $\hat w^{(n)}_t = \hat w_t$ for $t \in [0,n)$ and $\hat w^{(n)}_n = \hat w_{n-}$. Moreover,
we can write

\[ (\hat w_t)_{t \in [0,n]} = \pi^{(n)}(w) = \varphi_n\big(\pi(w^{(0)}), \dots, \pi(w^{(n-1)}), \xi_1, \dots, \xi_{n-1}\big) \]

for an appropriate measurable function
$\varphi_n : (\mathbb{D}[0,1])^n \times \mathbb{R}^{n-1} \to \mathbb{D}[0,n]$.

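The main motivation for this construction is the following property. If one has, for
some $(w_t) \in C[0,\infty)$ and $n \in \mathbb{N}$,

\[ \|w - \pi^{(n)}(w)\|_{[0,n]} \le \frac{M}{M-1}\, d \]

and $\|w^{(n)} - \pi(w^{(n)})\|_{[0,1]} \le d$, then

\[ |w_n - (\hat w_{n-} + \xi_n)| \le \frac{d}{M-1}, \]

whence,

\[ \|w - \hat w\|_{[n,n+1)} = \big\| w_n + w^{(n)} - \big(\hat w_{n-} + \xi_n + \pi(w^{(n)})\big) \big\|_{[0,1)}
   \le |w_n - (\hat w_{n-} + \xi_n)| + \|w^{(n)} - \pi(w^{(n)})\|_{[0,1)}
   \le \frac{d}{M-1} + d = \frac{M}{M-1}\, d. \]

In particular, if $\pi : C[0,1] \to \mathbb{D}[0,1]$ satisfies

\[ \big\| \|X - \pi(X)\|_{[0,1]} \big\|_\infty \le d, \]

then for any $n \in \mathbb{N}$,

\[ \big\| \|X - \pi^{(n)}(X)\|_{[0,n]} \big\|_\infty \le \frac{M}{M-1}\, d. \tag{4} \]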
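
The induction above can be mimicked on a sampled path. The following sketch is only an illustration under explicit assumptions: the path is known at the times $j/m$, the base strategy $\pi$ is replaced by a hypothetical stand-in (`base_strategy`, rounding to multiples of $2d$, which has supremum error at most $d$), and the left limit $\hat w_{n-}$ is represented by the reconstruction of the right endpoint of the previous block. None of these names or choices come from the article.

```python
import numpy as np

def base_strategy(block, d):
    """Hypothetical stand-in for pi: round to the nearest multiple of 2d (sup error <= d)."""
    return 2.0 * d * np.round(block / (2.0 * d))

def concatenated_reconstruction(w, n, m, M, d):
    """Reconstruct a path sampled at times j/m, j = 0..n*m, block by block."""
    grid = -d + 2.0 * d * np.arange(M) / (M - 1)    # admissible corrections xi
    w_hat = np.empty(n * m + 1)
    left_limit = 0.0                                 # plays the role of \hat w_{k-}
    for k in range(n):
        anchor = w[k * m]
        block = w[k * m:(k + 1) * m + 1] - anchor    # samples of w^{(k)} on [0, 1]
        if k == 0:
            offset = 0.0
        else:
            # np.argmin returns the first minimizer, i.e. the smallest grid value.
            xi = grid[np.argmin(np.abs(anchor - (left_limit + grid)))]
            offset = left_limit + xi
        rec = offset + base_strategy(block, d)
        w_hat[k * m:(k + 1) * m] = rec[:m]
        left_limit = rec[m]
    w_hat[n * m] = left_limit                        # value at time n is the left limit
    return w_hat

rng = np.random.default_rng(1)
n, m, M, d = 5, 50, 4, 0.1
w = np.concatenate([[0.0], np.cumsum(rng.normal(scale=m**-0.5, size=n * m))])
w_hat = concatenated_reconstruction(w, n, m, M, d)
max_error = np.max(np.abs(w - w_hat))
```

In this discretized setting the same induction as in the text shows that `max_error` cannot exceed $\frac{M}{M-1} d$, regardless of how long the path is; only the $M$-valued corrections $\xi_k$ have to be transmitted in addition to the block reconstructions.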

3 Polynomial decay of $D^{(e)}(r|\infty)$

The objective of this section is to prove the following theorem.

Theorem 3.1. There exists a constant $\kappa = \kappa(H) \in (0,\infty)$ such that

\[ \lim_{r\to\infty} r^H D^{(e)}(r|\infty) = \kappa. \tag{5} \]

Thereafter, $\kappa = \kappa(H)$ will always denote the finite constant defined via equation (5).
In order to simplify notations, we abridge $\|\cdot\| = \|\cdot\|_{[0,1]}$.

Remark 3.2. It was found in [3] (see Theorem 3.5.2) that for finite moments $q \ge 1$ the
entropy coding error is related to the asymptotic behavior of the small ball function of the
Gaussian measure. In particular, for fractional Brownian motion, one obtains that

\[ D^{(e)}(r|q) \approx \frac{1}{r^H}, \qquad r \to \infty. \tag{6} \]

In order to show that $D^{(e)}(r|\infty)$ is of the order $r^{-H}$, we still need to prove an
appropriate upper bound. We prove a stronger statement which will be useful later on.

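Lemma 3.3. There exist strategies $\pi^{(r)} : C[0,1] \to C[0,1]$, $r \ge 0$, and probability
weights $(p^{(r)}_w)_{w \in \mathrm{im}(\pi^{(r)})}$ such that for any $q \ge 1$,

\[ \big\| \|X - \pi^{(r)}(X)\|_{[0,1]} \big\|_\infty \le \frac{1}{r^H}
   \quad\text{and}\quad
   \mathbb{E}\big[ (-\log p^{(r)}_{\pi^{(r)}(X)})^q \big]^{1/q} \approx r. \tag{7} \]

In particular, $D^{(e)}(r|\infty) \approx r^{-H}$.

The proof of the lemma is based on an asymptotic estimate for the mass concentration
in randomly centered small balls, to be found in ?|. Let $X_1$ denote a fractional Brownian
motion that is independent of $X$ with $\mathcal{L}(X) = \mathcal{L}(X_1)$. Then, for any
$q \in [1,\infty)$, one has

\[ \mathbb{E}\big[ (-\log \mathbb{P}(\|X - X_1\| \le \varepsilon \,|\, X))^q \big]^{1/q}
   \approx -\log \mathbb{P}(\|X\| \le \varepsilon) \approx \varepsilon^{-1/H} \tag{8} \]

as $\varepsilon \downarrow 0$ (see pj, Theorem 4.2 and Corollary 4.4).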

Proof. For a given $\mathbb{D}[0,1]$-valued sequence $(w_n)_{n \in \mathbb{N} \cup \{\infty\}}$,
we consider the following coding strategy $\pi^{(r)}(\cdot|(w_n))$: let

\[ T^{(r)}(w) := T^{(r)}(w|(w_n)) := \inf\{ n \in \mathbb{N} : \|w - w_n\| \le 1/r^H \}, \]

with the convention that the infimum of the empty set is $\infty$, and set

\[ \pi^{(r)}(w) := \pi^{(r)}(w|(w_n)) := w_{T^{(r)}(w)}. \]

Moreover, let $(p_n)_{n \in \mathbb{N}}$ denote the sequence of probability weights defined as

\[ p_n = \frac{6}{\pi^2}\, \frac{1}{n^2}, \qquad n \in \mathbb{N}, \]

and set $p_\infty := 0$.

Now we let $(X_n)_{n \in \mathbb{N} \cup \{\infty\}}$ denote independent FBM's that are also
independent of $X$, and analyze the random coding strategies
$\pi^{(r)}(\cdot) := \pi^{(r)}(\cdot|(X_n))$. With $T^{(r)} := T^{(r)}(X|(X_n))$ we obtain

\[ \hat X^{(r)} := \pi^{(r)}(X) = X_{T^{(r)}}, \]

and

\[ \mathbb{E}\big[ (-\log p_{T^{(r)}})^q \big]^{1/q} \le 2\, \mathbb{E}\big[ (\log T^{(r)})^q \big]^{1/q} + \log\frac{\pi^2}{6}. \tag{9} \]

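Given $X$, the random time $T^{(r)}$ is geometrically distributed with parameter
$\mathbb{P}(\|X - X_1\| \le 1/r^H \,|\, X)$, and due to Lemma A.2 there exists a universal
constant $c_1 = c_1(q) < \infty$ for which

\[ \mathbb{E}\big[ (\log T^{(r)})^q \,|\, X \big]^{1/q} \le c_1 \big[ 1 + \log \mathbb{E}[T^{(r)}|X] \big]
   = c_1 \big[ 1 + \log 1/\mathbb{P}(\|X - X_1\| \le 1/r^H \,|\, X) \big]. \]

Consequently,

\[ \mathbb{E}\big[ (\log T^{(r)})^q \big]^{1/q} = \mathbb{E}\big[ \mathbb{E}[ (\log T^{(r)})^q \,|\, X ] \big]^{1/q}
   \le c_1\, \mathbb{E}\big[ (1 + \log 1/\mathbb{P}(\|X - X_1\| \le 1/r^H \,|\, X))^q \big]^{1/q}
   \le c_1 \big( 1 + \mathbb{E}\big[ (-\log \mathbb{P}(\|X - X_1\| \le 1/r^H \,|\, X))^q \big]^{1/q} \big). \tag{10} \]

Due to (8), one has

\[ \mathbb{E}\big[ (-\log \mathbb{P}(\|X - X_1\| \le 1/r^H \,|\, X))^q \big]^{1/q} \approx r, \]

so that (9) and (10) imply that $\mathbb{E}[(-\log p_{T^{(r)}})^q]^{1/q} \le c_2 r$ for some
appropriate constant $c_2 < \infty$. In particular, for any $r \ge 0$, we can find a
$C[0,1]$-valued sequence $(w_n^{(r)})_{n \in \mathbb{N}}$ of pairwise different elements such that

\[ \mathbb{E}\big[ (-\log p_{T^{(r)}(X|(w_n^{(r)}))})^q \big]^{1/q} \le \mathbb{E}\big[ (-\log p_{T^{(r)}})^q \big]^{1/q} \le c_2 r. \]

Now the strategies $\pi^{(r)}(\cdot|(w_n^{(r)}))$ with associated probability weights
$p^{(r)}_{w_n^{(r)}} := p_n$ ($n \in \mathbb{N}$) satisfy (7). Moreover,
$D^{(e)}(r|\infty) \approx r^{-H}$ follows since

\[ \mathbb{H}\big( \pi^{(r)}(X|(w_n^{(r)})) \big) \le \mathbb{E}\big[ -\log p^{(r)}_{\pi^{(r)}(X|(w_n^{(r)}))} \big]. \]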



□ 
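
The cost of transmitting the index of the first codeword that hits the ball can be made concrete. The sketch below only evaluates the weights $p_n = \frac{6}{\pi^2} n^{-2}$ used in the proof and the associated code length $-\log p_n = 2\log n + \log(\pi^2/6)$, which is the identity behind (9); the function names are illustrative.

```python
import math

def weight(n):
    """p_n = (6 / pi^2) / n^2, a probability weight on the positive integers."""
    return 6.0 / (math.pi**2 * n**2)

def code_length(n):
    """-log p_n, the cost (in nats) of transmitting the index n."""
    return -math.log(weight(n))

# Partial sums of the weights approach one from below.
partial_mass = sum(weight(n) for n in range(1, 100001))
```

Since the code length grows only logarithmically in the index, the expected cost is governed by $\mathbb{E}[\log T^{(r)}]$ rather than by $T^{(r)}$ itself, which is what makes the random codebook efficient.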



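Let us now use the coding scheme of Section 2 to prove

Lemma 3.4. Let $n \in \mathbb{N}$, $r \ge 0$ and $\Delta r \ge 1$. Then

\[ D^{(e)}(n(r + \Delta r)|\infty) \le n^{-H}\, \frac{e^{\Delta r}}{e^{\Delta r} - 2}\, D^{(e)}(r|\infty). \tag{11} \]

Proof. Fix $\varepsilon > 0$ and let $\pi : C[0,1] \to \mathbb{D}[0,1]$ be a strategy satisfying

\[ \big\| \|X - \pi(X)\|_{[0,1]} \big\|_\infty \le (1+\varepsilon)\, D^{(e)}(r|\infty) =: d \tag{12} \]

and

\[ \mathbb{H}(\pi(X)) \le r. \]

Choose $M := \lfloor e^{\Delta r} \rfloor$ and let $\pi^{(n)}$ be as in Section 2. Note that
$\Delta r \ge 1$ guarantees that $M \ge e^{\Delta r} - 1 \ge e^{\Delta r}/2$, so that

\[ \frac{M}{M-1} \le \frac{e^{\Delta r}}{e^{\Delta r} - 2}. \]

We let $(X^{(i)}_t)_{t \in [0,1]} = (X_{i+t} - X_i)_{t \in [0,1]}$ for $i = 0, \dots, n-1$, and
$(\xi_i)_{i = 1, \dots, n-1}$ be as in Section 2 for $w = X$. Observe that, due to the
representation of $\pi^{(n)}$ via $\varphi_n$ in Section 2,

\[ \mathbb{H}(\pi^{(n)}(X)) \le \mathbb{H}\big( \pi(X^{(0)}), \dots, \pi(X^{(n-1)}), \xi_1, \dots, \xi_{n-1} \big)
   \le \mathbb{H}(\pi(X^{(0)})) + \dots + \mathbb{H}(\pi(X^{(n-1)})) + \mathbb{H}(\xi_1, \dots, \xi_{n-1})
   \le nr + \log |\mathrm{range}(\xi_1, \dots, \xi_{n-1})| \le nr + n \log M
   \le n(r + \Delta r). \]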

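Now let

\[ \alpha_n : \mathbb{D}[0,1] \to \mathbb{D}[0,n], \qquad f \mapsto \alpha_n(f)(s) = n^H f(s/n), \]

and consider the strategy

\[ \tilde\pi : C[0,1] \to \mathbb{D}[0,1], \qquad f \mapsto \alpha_n^{-1} \circ \pi^{(n)} \circ \alpha_n(f). \]

Since $\alpha_n(X)$ is again a fractional Brownian motion on $[0,n]$, it follows that, a.s.

\[ \|X - \tilde\pi(X)\|_{[0,1]} = n^{-H} \|\alpha_n(X) - \pi^{(n)}(\alpha_n(X))\|_{[0,n]}
   \le (1+\varepsilon)\, n^{-H}\, \frac{e^{\Delta r}}{e^{\Delta r} - 2}\, D^{(e)}(r|\infty). \]

Moreover,

\[ \mathbb{H}(\tilde\pi(X)) = \mathbb{H}\big( \alpha_n^{-1} \circ \pi^{(n)}(\alpha_n(X)) \big) = \mathbb{H}(\pi^{(n)}(X)) \le n(r + \Delta r). \]

Since $\varepsilon > 0$ is arbitrary, the proof is complete. □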
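
The self similarity used in this proof, namely that $\alpha_n(X)(s) = n^H X(s/n)$ is again a fractional Brownian motion, can be verified directly on the covariance kernel $K$. The following sketch checks the identity $n^{2H} K(t/n, s/n) = K(t,s)$ at a sample point; the function names are illustrative.

```python
def fbm_kernel(t, s, hurst):
    """Covariance kernel K(t, s) of fractional Brownian motion."""
    return 0.5 * (t**(2 * hurst) + s**(2 * hurst) - abs(t - s)**(2 * hurst))

def rescaled_kernel(t, s, hurst, n):
    """Covariance of the rescaled process alpha_n(X)(t) = n^H X(t / n)."""
    return n**(2 * hurst) * fbm_kernel(t / n, s / n, hurst)

gap = abs(rescaled_kernel(0.4, 2.5, 0.3, 7) - fbm_kernel(0.4, 2.5, 0.3))
```

Because a centered Gaussian process is determined by its covariance, equality of the two kernels is all that is needed for the change of time scale in the proof.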
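Proof of Theorem 3.1. For $r \ge 0$, $\Delta r \ge 1$ and $n \in \mathbb{N}$, Lemma 3.4 yields

\[ D^{(e)}(n(r + \Delta r)|\infty) \le \frac{1}{n^H}\, \frac{e^{\Delta r}}{e^{\Delta r} - 2}\, D^{(e)}(r|\infty). \]

Now set $\kappa := \liminf_{r\to\infty} r^H D^{(e)}(r|\infty)$, which lies in $(0,\infty)$ due to
Lemma 3.3. Let $\varepsilon \in (0, 1/2)$ be arbitrary, and choose $r_0, \Delta r \ge 1$ such that

\[ r_0^H D^{(e)}(r_0|\infty) \le (1+\varepsilon)\kappa, \qquad
   \Delta r \le \varepsilon r_0 \qquad\text{and}\qquad
   e^{-\Delta r} \le \varepsilon. \]

Then

\[ D^{(e)}((1+\varepsilon) n r_0|\infty) \le \frac{1}{n^H}\, \frac{1}{1 - 2\varepsilon}\, D^{(e)}(r_0|\infty)
   \le \frac{1}{((1+\varepsilon) n r_0)^H}\, \frac{(1+\varepsilon)^{1+H}}{1 - 2\varepsilon}\, \kappa \]

and we obtain that

\[ \limsup_{n\to\infty}\, ((1+\varepsilon) n r_0)^H D^{(e)}((1+\varepsilon) n r_0|\infty)
   \le \frac{(1+\varepsilon)^{1+H}}{1 - 2\varepsilon}\, \kappa. \]

Let now $r \ge (1+\varepsilon) r_0$ and introduce
$\bar r = \bar r(r) = \min\{(1+\varepsilon) n r_0 : n \in \mathbb{N},\ r \le (1+\varepsilon) n r_0\}$
as well as
$\underline r = \underline r(r) = \max\{(1+\varepsilon) n r_0 : n \in \mathbb{N},\ (1+\varepsilon) n r_0 \le r\}$.
Using the monotonicity of $D^{(e)}(\cdot|\infty)$ we conclude that

\[ \limsup_{r\to\infty} r^H D^{(e)}(r|\infty) \le \limsup_{r\to\infty} \bar r^H D^{(e)}(\underline r|\infty)
   \le \limsup_{r\to\infty}\, (\underline r + (1+\varepsilon) r_0)^H D^{(e)}(\underline r|\infty)
   \le \frac{(1+\varepsilon)^{1+H}}{1 - 2\varepsilon}\, \kappa. \]

Noticing that $\varepsilon > 0$ is arbitrary finishes the proof. □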

4 The quantization problem 

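Theorem 4.1. One has for any $q \in (0,\infty)$,

\[ D^{(q)}(r|q) \sim \kappa\, \frac{1}{r^H}, \qquad r \to \infty. \]

We need some preliminary lemmas for the proof of the theorem.

Lemma 4.2. There exist strategies $(\pi^{(r)})_{r \ge 0}$ and probability weights $(p^{(r)}_w)$
such that

\[ \big\| \|X - \pi^{(r)}(X)\|_{[0,1]} \big\|_\infty \le \kappa\, \frac{1}{r^H}
   \quad\text{and}\quad -\log p^{(r)}_{\pi^{(r)}(X)} \lesssim r, \ \text{in probability}. \]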
Proof. Let e > and choose ro > 2 such that 



,r -iy " 2 
By Theorem 13.11 

DM((l + e /2)r|oo)<icj^^ 
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In particular, there exists n > tq V | log(ro + 1) and a map it : C[0, 1] — > B[0, 1] such that 

IHlx-TTpoiircnlL <«^— ^ = :d and h(tt(x)) < (i + e/2)n. 

For n G N, let tt^ and 93 n be as in Section [2] for M = [Yo] , d and ir. Then by © 

llll^-^Wii IM |L<«^^i<K-L. (i 3 ) 

For t^ ),...,^- 1 ) G im(7r) and h,...,k n -i G + : fc = 0, . . . , M - 1}, let p^ 
be defined as 

n-l 



i=0 



The (pw^) define probability weights on the image of (p n . Moreover, 

n-l 

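and the ergodic theorem implies

\[ \lim_{n\to\infty} -\frac{1}{n} \log p^{(n)}_{(\hat X_t)_{t \in [0,n]}} = \log M + \mathbb{H}(\pi(X)), \quad\text{a.s.} \]

Note that $\log M + \mathbb{H}(\pi(X)) \le (1+\varepsilon) r_1$.

Just as in the proof of Lemma 3.4, we use the self similarity of $X$ to translate the
strategy $\pi^{(n)}$ into a strategy for encoding $(X_t)_{t \in [0,1]}$. For $n \in \mathbb{N}$, let

\[ \alpha_n : \mathbb{D}[0,1] \to \mathbb{D}[0,n], \qquad f \mapsto (\alpha_n f)(t) = n^H f(t/n), \]

and consider $\tilde p^{(n)}_{\alpha_n^{-1}(w)} := p^{(n)}_w$ and
$\tilde\pi^{(n)}(w) := \alpha_n^{-1} \circ \pi^{(n)} \circ \alpha_n(w)$. Then

\[ -\log \tilde p^{(n)}_{\tilde\pi^{(n)}(X)} = -\log p^{(n)}_{\pi^{(n)}(\alpha_n(X))} \lesssim (1+\varepsilon)\, n r_1, \quad\text{in probability}, \]

and by (13),

\[ \big\| \|X - \tilde\pi^{(n)}(X)\|_{[0,1]} \big\|_\infty
   = \big\| \|\alpha_n^{-1}\big(\alpha_n(X) - \pi^{(n)}(\alpha_n(X))\big)\|_{[0,1]} \big\|_\infty
   = n^{-H} \big\| \|\alpha_n(X) - \pi^{(n)}(\alpha_n(X))\|_{[0,n]} \big\|_\infty
   = n^{-H} \big\| \|X - \pi^{(n)}(X)\|_{[0,n]} \big\|_\infty \le \kappa\, \frac{1}{(n r_1)^H}. \]

By choosing $\pi^{(r)} = \tilde\pi^{(n)}$ and $(p^{(r)}_w) = (\tilde p^{(n)}_w)$ for
$r \in ((n-1) r_1, n r_1]$, one obtains a coding scheme satisfying

\[ \big\| \|X - \pi^{(r)}(X)\|_{[0,1]} \big\|_\infty \le \kappa\, \frac{1}{r^H} \]

and

\[ -\log p^{(r)}_{\pi^{(r)}(X)} \lesssim (1+\varepsilon)\, r, \quad\text{in probability}, \]

so that the assertion follows by a diagonalization argument. □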

Remark 4.3. In the above proof, we have constructed a high resolution coding scheme
based on a strategy $\pi : C[0,1] \to \mathbb{D}[0,1]$, using the identity
$\tilde\pi_n = \alpha_n^{-1} \circ \pi^{(n)} \circ \alpha_n$. This coding scheme leads to a coding
error which is at most

\[ \frac{M}{M-1}\, \big\| \|X - \pi(X)\|_{[0,1]} \big\|_\infty\, n^{-H}. \tag{14} \]

Moreover, the ergodic theorem implies that, for large $n$, $\tilde\pi_n(X)$ lies with probability
almost one in the typical set
$\{ w \in \mathbb{D}[0,1] : -\log p_w \le n(\mathbb{H}(\pi(X)) + \log M + \varepsilon) \}$, where
$\varepsilon > 0$ is arbitrarily small. This set contains at most
$\exp\{ n(\mathbb{H}(\pi(X)) + \log M + \varepsilon) \}$ elements, and will serve as a close
to optimal high resolution codebook. It remains to control the case where $\tilde\pi_n(X)$ is
not in the typical set. We will do this in the proof of Theorem 4.1 at the end of this
section (see (19)).
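
The bound on the size of the typical set rests on an elementary counting fact: since the weights sum to at most one, at most $e^{t}$ elements can satisfy $-\log p_w \le t$. A toy illustration follows; the weights and names are hypothetical and serve only to show the counting argument.

```python
import math

def typical_set(weights, threshold):
    """Elements w whose code length -log p_w does not exceed the threshold."""
    return [w for w, p in weights.items() if p > 0.0 and -math.log(p) <= threshold]

# Hypothetical probability weights on four reconstructions.
weights = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
threshold = 1.5
codebook = typical_set(weights, threshold)
```

Here the typical set consists of the two most likely elements, well within the general bound $e^{1.5}$ on its cardinality.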

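Proposition 4.4. For $q > 1$ there exist strategies $(\pi^{(r)})_{r \ge 0}$ and probability
weights $(p^{(r)}_w)$ such that

\[ \big\| \|X - \pi^{(r)}(X)\|_{[0,1]} \big\|_\infty \le \kappa\, \frac{1}{r^H}
   \quad\text{and}\quad
   \lim_{r\to\infty} \frac{\mathbb{E}\big[ (-\log p^{(r)}_{\pi^{(r)}(X)})^q \big]^{1/q}}{r} = 1. \tag{15} \]

In addition, for any $\varepsilon > 0$ one has

\[ \lim_{r\to\infty} \sup\, \mathbb{P}\Big( -\log p_{\pi(X)} \le (1-\varepsilon) r,\ \|X - \pi(X)\| \le \kappa\, \frac{1}{r^H} \Big) = 0, \tag{16} \]

where the supremum is taken over all strategies $\pi : C[0,1] \to \mathbb{D}[0,1]$ and over all
sequences of probability weights $(p_w)$.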

Proof. Let q > 1 and let (r > 0) be a strategy and {pw' 1 ^) a sequence of probability 
weights as in Lemma 14.21 Moreover, let 7^ and {pw^) (r > 0) be as in Lemma 13.31 for 
2q. We consider the maps (w) := — logp^'ri anci ft? v 10 ) := — l°gP m > an d set 

7T^ '(ill) 7T£ (w) 



tvP(w) if 4' ] H < (1 + 5)r, 
7^2 ^ (w) otherwise, 
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E[(-logp$ l )(x) ) q ] 1/q <(l + S)r. 



for some fixed 6 > 0. Then one obtains, for p$ = ^(pw' 1 ^ +Pu>' 2 ^) and % := {w £ C[0, 1] : 
K ( {\w) < (l + <5)r}, 

E[(-log2^ )(x) )<f /g < E[i Tr (x)4\xy}^ + E[i Tr 4x)4\xy}^ 

< (l + 5)r + F{XG% c ) 1 / 2q E[4\x) 2q ] 1 / 2q . 

The definitions of ir[ r } and vrf ) imply that lim^oo P(X G T r c ) = and E[k^ ) (X) 2< ?] 1 / 2 '? « 
r. Consequently, 

r )(X)> 

Since 5 > can be chosen arbitrarily small, a diagonalization procedure leads to strategies 
7f( r ) and probability weights (p^ ) with 

||[|^-^ r) WII[o,i]IL^' c ^ and E[(-logP iM( x)) 9 ] 1/9 <^ 

which proves the first assertion. 

It remains to show that for arbitrary strategies 7t^ r \ r > 0, and probability weights 
(pi ): 

hm p(- logP% (x) < (1 - e)r, ||X - *«(*)|| < k^) = 0. (17) 
Without loss of generality, we can assume that 

\\\\x-* {r \m m \L<^- as) 

Otherwise we modify the map it^ for all w E C[0, 1] with ||it? — 7f (if) ]| > nr~ H in such 
a way that (|T%|) be valid. Hereby the probability in lfT7j) increases and it suffices to prove 
the statement for the modified strategy. Let us consider 



7r^ r ^(w) else. 



Then the probability weights p( r ) := |(p^ +p^ r ') satisfy 

E[(-log2^ ( ) x) )"] 1 / 9 < E[(-logjf ( ^)<f /9 < r . 

Recall that 

hence by Theorem 13.11 one has E[— logp^,,^.] > IH(7r( r )(X)) > r. Lemma IA.1I thus 
implies that 

(r) 

— logfr^™ ~ r, in probability. 
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In particular, 



- lo gpJv)pn - ~ lo § 2 ^1m(x) £ r, in probability, 



>(X) - to "*VM(X) 

which implies (|17|) . □ 

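Proof of Theorem 4.1. We start by proving the lower bound. Fix $q > 0$, let $C_r$, $r \ge 0$,
denote arbitrary codebooks of size $e^r$, and let $\pi^{(r)} : C[0,1] \to C_r$ denote arbitrary
strategies. Moreover, let $(p^{(r)}_w)$ be the sequence of probability weights defined as
$p^{(r)}_w = 1/|C_r|$, $w \in C_r$. Then $-\log p^{(r)}_{\pi^{(r)}(X)} \le r$ a.s., and the above
proposition implies that for any $\varepsilon \in (0,1)$,

\[ \lim_{r\to\infty} \mathbb{P}\Big( \|X - \pi^{(r)}(X)\| \le \kappa\, \frac{(1-\varepsilon)^H}{r^H} \Big) = 0. \]

Therefore,

\[ \mathbb{E}\big[ \|X - \pi^{(r)}(X)\|^q \big]^{1/q}
   \ge \kappa\, \frac{(1-\varepsilon)^H}{r^H}\, \mathbb{P}\Big( \|X - \pi^{(r)}(X)\| > \kappa\, \frac{(1-\varepsilon)^H}{r^H} \Big)^{1/q}
   \gtrsim \kappa\, \frac{(1-\varepsilon)^H}{r^H}, \]

which proves the lower bound.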

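It remains to show that $D^{(q)}(r|q) \lesssim \kappa / r^H$. By Lemma 4.2 there exist strategies
$\pi^{(r)}$ and probability weights $(p^{(r)}_w)$ such that

\[ \big\| \|X - \pi^{(r)}(X)\|_{[0,1]} \big\|_\infty \le \kappa\, \frac{1}{r^H}
   \quad\text{and}\quad -\log p^{(r)}_{\pi^{(r)}(X)} \lesssim r, \ \text{in probability}. \]

Furthermore, due to Theorem 4.1 in 0, there exist codebooks $\bar C_r$ of size $e^r$ with

\[ \mathbb{E}\Big[ \min_{w \in \bar C_r} \|X - w\|^{2q} \Big]^{1/2q} \approx \frac{1}{r^H}. \]

We consider the codebook
$C_r := \bar C_r \cup \{ w : -\log p^{(r)}_w \le (1 + \varepsilon/2) r \}$. Clearly, $C_r$ contains
at most $e^r + e^{(1+\varepsilon/2) r}$ elements. Moreover,

\[ \mathbb{E}\Big[ \min_{w \in C_r} \|X - w\|^q \Big]^{1/q}
   \le \mathbb{E}\Big[ 1_{C_r}(\pi^{(r)}(X)) \Big( \frac{\kappa}{r^H} \Big)^q \Big]^{1/q}
   + \mathbb{E}\Big[ 1_{C_r^c}(\pi^{(r)}(X)) \min_{w \in \bar C_r} \|X - w\|^q \Big]^{1/q} \tag{19} \]
\[ \le \kappa\, \frac{1}{r^H} + \mathbb{P}\big( \pi^{(r)}(X) \notin C_r \big)^{1/2q}\,
   \mathbb{E}\Big[ \min_{w \in \bar C_r} \|X - w\|^{2q} \Big]^{1/2q}. \]

Since $\lim_{r\to\infty} \mathbb{P}(\pi^{(r)}(X) \notin C_r) = 0$ and the succeeding expectation is
of order $O(1/r^H)$, the second summand is of order $o(1/r^H)$. Therefore, for
$r \ge 2/\varepsilon$,

\[ D^{(q)}((1+\varepsilon) r|q) \le \mathbb{E}\Big[ \min_{w \in C_r} \|X - w\|^q \Big]^{1/q} \lesssim \kappa\, \frac{1}{r^H}. \]

By switching from $r$ to $\tilde r = (1+\varepsilon) r$, we obtain

\[ D^{(q)}(\tilde r|q) \lesssim \kappa\, (1+\varepsilon)^H\, \frac{1}{\tilde r^H}. \]

Since $\varepsilon > 0$ was arbitrary, the proof is complete. □

5 Implications of the equivalence of moments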



In this section we complement Theorem 4.1 by

Theorem 5.1. For arbitrary $q \in (0,\infty]$, one has

\[ D^{(e)}(r|q) \sim \kappa\, \frac{1}{r^H}. \]

The proof of this theorem is based on the following general principle: if the asymptotic
quantization error coincides for two different moments $q_1 < q_2$, then all moments
$q \le q_2$ lead to the same asymptotic quantization error and the entropy coding problem
coincides with the quantization problem for all moments $q \le q_2$.

Let us prove this relationship in a general setting. $E$ and $\hat E$ denoting arbitrary
measurable spaces and $d : E \times \hat E \to [0,\infty)$ a measurable function, the
quantization error for a general $E$-valued r.v. $X$ under the distortion $d$ is defined as

\[ D^{(q)}(r|q) = \inf_{C \subset \hat E} \mathbb{E}\Big[ \min_{\hat x \in C} d(X, \hat x)^q \Big]^{1/q}, \]

where the infimum is taken over all codebooks $C \subset \hat E$ with $|C| \le e^r$. In order to
simplify notations, we abridge

\[ d(x, A) = \inf_{y \in A} d(x,y), \qquad x \in E,\ A \subset \hat E. \]

Analogously, we denote the entropy coding error by

\[ D^{(e)}(r|q) = \inf_{\hat X} \mathbb{E}\big[ d(X, \hat X)^q \big]^{1/q}, \]

where the infimum is taken over all discrete $\hat E$-valued r.v. $\hat X$ with
$\mathbb{H}(\hat X) \le r$.

Then Theorem 5.1 is a consequence of Theorem 4.1 and the following theorem.



Theorem 5.2. Assume that $f : [0,\infty) \to \mathbb{R}_+$ is a decreasing, convex function satisfying

dr j 

f(r 



limsup ® r . \ K ' < oo, (20) 



and suppose that, for some $0 < q_1 < q_2$,

\[ D^{(q)}(r + \log 2|q_1) \sim D^{(q)}(r|q_2) \sim f(r). \]

Then for any $q > 0$,

\[ D^{(e)}(r|q) \gtrsim f(r). \]

We need some technical lemmas.

Lemma 5.3. Let $0 < q_1 < q_2$ and $f : [0,\infty) \to \mathbb{R}_+$. If

\[ D^{(q)}(r + \log 2|q_1) \sim D^{(q)}(r|q_2) \sim f(r), \]

then for any $\varepsilon > 0$,

\[ \lim_{r\to\infty}\ \sup_{C \subset \hat E,\ |C| \le e^r} \mathbb{P}\big( d(X,C) \le (1-\varepsilon) f(r) \big) = 0. \]

Proof. For $r \ge 0$, let $C_r^*$ denote codebooks of size $e^r$ with

\[ \mathbb{E}\big[ d(X, C_r^*)^{q_2} \big]^{1/q_2} \sim f(r). \tag{21} \]

Now let $\tilde C_r$ denote arbitrary codebooks of size $e^r$, and consider the codebooks
$C_r := C_r^* \cup \tilde C_r$. Using (21) and the inequality $q_1 \le q_2$, it follows that

\[ f(r) \gtrsim \mathbb{E}\big[ d(X,C_r)^{q_2} \big]^{1/q_2} \ge \mathbb{E}\big[ d(X,C_r)^{q_1} \big]^{1/q_1}
   \ge D^{(q)}(r + \log 2|q_1) \sim f(r). \]

Hence, Lemma A.1 implies that

\[ d(X, C_r) \sim f(r), \quad\text{in probability}, \]

so that in particular,

\[ d(X, \tilde C_r) \gtrsim f(r), \quad\text{in probability}. \]

□

Lemma 5.4. Assume that $f : [0,\infty) \to \mathbb{R}_+$ is a decreasing, convex function
satisfying (20) and

\[ \lim_{r\to\infty}\ \sup_{C \subset \hat E,\ |C| \le e^r} \mathbb{P}\big( d(X,C) \le f(r) \big) = 0. \]

Then for any $q > 0$,

\[ D^{(e)}(r|q) \gtrsim f(r). \]

Proof. The result is a consequence of the technical Lemma A.3. Consider the family
$\mathcal{F}$ consisting of all random vectors

\[ (A,B) = \big( d(X, \hat X)^q,\ -\log p_{\hat X} \big), \]

where $\hat X$ is an arbitrary discrete $\hat E$-valued r.v. and $(p_w)$ is an arbitrary sequence
of probability weights on the range of $\hat X$. Let $\tilde f(r) = f(r)^q$, $r \ge 0$. Then for
any choice of $\hat X$ and $(p_w)$ and an arbitrary $r \ge 0$, the set
$C := \{ w \in \hat E : -\log p_w \le r \}$ contains at most $e^r$ elements. Consequently,

\[ \mathbb{P}\big( d(X,\hat X)^q \le \tilde f(r),\ -\log p_{\hat X} \le r \big)
   = \mathbb{P}\big( d(X,\hat X) \le f(r),\ \hat X \in C \big) \le \mathbb{P}\big( d(X,C) \le f(r) \big). \]

By assumption the right hand side converges to 0 as $r \to \infty$, independently of the choice
of $\hat X$ and $(p_w)$. Since $\tilde f$ satisfies condition (20), Lemma A.3 implies that

\[ D^{(e)}(r|q) = \inf_{\hat X : \mathbb{H}(\hat X) \le r} \mathbb{E}\big[ d(X,\hat X)^q \big]^{1/q}
   = \inf_{A \in \mathcal{F}_r} \mathbb{E}[A]^{1/q} \gtrsim \tilde f(r)^{1/q} = f(r), \]

where $\mathcal{F}_r = \{ A : (A,B) \in \mathcal{F},\ \mathbb{E}[B] \le r \}$. □

Theorem 5.2 is now an immediate consequence of Lemma 5.3 and Lemma 5.4.

6 Coding with respect to the $L^p[0,1]$-norm distortion

In this section, $p \in [1,\infty)$ is fixed. In contrast to the previous sections, we consider
entropy coding and quantization of $X$ in $L^p[0,1]$, i.e. $\hat E = L^p[0,1]$ and
$d(f,g) = \|f - g\|_{L^p[0,1]}$. In order to treat these approximation problems, we need to
introduce Shannon's distortion rate function. It is defined as

\[ D(r|q) = \inf\, \big\| \|X - \hat X\|_{L^p[0,1]} \big\|_q, \]

where the infimum is taken over all $\hat E$-valued r.v.'s $\hat X$ satisfying the mutual
information constraint $I(X;\hat X) \le r$. Here and elsewhere $I$ denotes the Shannon mutual
information, defined as

\[ I(X;\hat X) = \begin{cases}
   \int \log \dfrac{dP_{X,\hat X}}{d(P_X \otimes P_{\hat X})}\, dP_{X,\hat X} & \text{if } P_{X,\hat X} \ll P_X \otimes P_{\hat X}, \\
   \infty & \text{else.}
   \end{cases} \]

The objective of this section is to prove

Theorem 6.1. The following limit exists

\[ \kappa_p = \kappa_p(H) = \lim_{r\to\infty} r^H D(r|p) \in (0,\infty), \tag{22} \]

and for any $q > 0$, one has

\[ D^{(q)}(r|q) \sim D^{(e)}(r|q) \sim \kappa_p\, \frac{1}{r^H}. \tag{23} \]

We will first prove that statement (23) is valid for

\[ \kappa_p := \liminf_{r\to\infty} r^H D(r|p). \]

Since $D(r|p)$ is dominated by $D^{(q)}(r|p)$, the existence of the limit in (22) then follows
immediately. Due to Theorem 1.2 in [4], the distortion rate function $D(\cdot|p)$ has the same
weak asymptotics as $D^{(q)}(\cdot|p)$. In particular, $D(r|p) \approx r^{-H}$ and $\kappa_p$ lies
in $(0,\infty)$.

We proceed as follows: decomposing $X$ into the two processes

\[ X^{(1)} = (X_t - X_{\lfloor t \rfloor})_{t \ge 0} \quad\text{and}\quad X^{(2)} = (X_{\lfloor t \rfloor})_{t \ge 0}, \]

we consider the coding problem for $X^{(1)}$ and $X^{(2)}$ in $L^p[0,n]$ ($n \in \mathbb{N}$ being
large). We control the coding complexity of the first term via Shannon's Source Coding Theorem
(SCT) and use a limit argument in order to show that the coding complexity of $X^{(2)}$
is asymptotically negligible. We recall the SCT in a form which is appropriate for our
discussion; for $n \in \mathbb{N}$, let

\[ d_p(f,g) = \Big( \int_0^1 |f(t) - g(t)|^p\, dt \Big)^{1/p} \]

and

\[ d_{n,p}(f,g) = \Big( \int_0^n |f(t) - g(t)|^p\, \frac{dt}{n} \Big)^{1/p}. \]

Then $d_n(f,g) = d_{n,p}(f,g)^p$, $n \in \mathbb{N}$, is a single letter distortion measure, when
interpreting the function $f|_{[0,n)}$ as the concatenation of the "letters"
$f^{(0)}, \dots, f^{(n-1)}$, where $f^{(i)} = (f(i+t))_{t \in [0,1)}$. Analogously, the process
$X^{(1)}$ corresponds to the letters $X^{(1,i)} := (X^{(1)}_{i+t})_{t \in [0,1)}$,
$i \in \mathbb{N}_0$. Since $(X^{(1,i)})_{i \in \mathbb{N}_0}$ is an ergodic stationary
$C[0,1)$-valued process, the SCT implies that for fixed $r \ge 0$ and $\varepsilon > 0$ there
exist codebooks $C_n \subset L^p[0,n]$, $n \in \mathbb{N}$, with at most
$\exp\{(1+\varepsilon) n r\}$ elements such that

\[ \lim_{n\to\infty} \mathbb{P}\big( d_n(X^{(1)}, C_n) \le (1+\varepsilon)\, D(r|p)^p \big) = 1. \tag{24} \]

A proof of this statement can be carried out by using the asymptotic equipartition property
as stated in 2] (Theorem 1). The proof is standard and therefore omitted. For further 



El °r 0. 



details concerning the distortion rate function one can consult 

First we prove a lemma which will later be used to control the coding complexity
of $X^{(2)}$.

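Lemma 6.2. Let $(Z_i)_{i \in \mathbb{N}}$ be an ergodic stationary sequence of real-valued r.v.'s
and let $S_n = \sum_{i=1}^n Z_i$, $n \in \mathbb{N}_0$. Then there exist codebooks
$C_n \subset \mathbb{R}^n$ of size $\exp\{ n\mathbb{E}[\log(|Z_1|/2\varepsilon + 2)] + nc \}$
satisfying

\[ \lim_{n\to\infty} \mathbb{P}\Big( \min_{\hat s_1^n \in C_n} \|S_1^n - \hat s_1^n\|_{l_\infty^n} \le \varepsilon \Big) = 1, \]

where $S_1^n$ denotes $(S_i)_{i=1,\dots,n}$, $c$ is a universal constant and
$\|\cdot\|_{l_\infty^n}$ denotes the maximum norm on $\mathbb{R}^n$.

Proof. Let $c > 0$ be such that $(p_n)_{n \in \mathbb{Z}}$ defined through

\[ p_n = \frac{1}{e^c}\, \frac{1}{(|n| + 1)^2} \]

is a sequence of probability weights. For a given sequence $(s_n)_{n \in \mathbb{N}_0}$, we define
a reconstruction $(\hat s_n)$ recursively. The construction depends on a parameter
$\varepsilon > 0$. Let $\hat s_0 = 0$ and suppose that $\hat s_0^n = (\hat s_i)_{i=0,\dots,n}$ is
already defined. Then we choose a $\xi_{n+1} \in 2\varepsilon\mathbb{Z}$ minimizing the distance

\[ |s_{n+1} - (\hat s_n + \xi_{n+1})| \]

and set $\hat s_{n+1} := \hat s_n + \xi_{n+1}$. This defines maps
$\pi_n : \mathbb{R}^n \to \mathbb{R}^n$, $s_1^n \mapsto \pi_n(s_1^n) := \hat s_1^n$. We equip
the range of $\pi_n$ with a sequence of probability weights via

\[ p_{\hat s_1^n} = \prod_{i=1}^n p_{\xi_i / 2\varepsilon}. \]

Then

\[ -\log p_{\hat s_1^n} \le 2 \sum_{i=1}^n \log(|\xi_i|/2\varepsilon + 1) + nc. \]

Now consider $\pi_n(S_1^n)$. Let $\xi_n = \xi_n((S_n))$ be as above when replacing the
deterministic argument $(s_n)$ by $(S_n)$. Then

\[ |\xi_n - Z_n| = |\hat S_n - \hat S_{n-1} - S_n + S_{n-1}| \le 2\varepsilon \]

and, hence, $|\xi_n| \le |Z_n| + 2\varepsilon$. Consequently,

\[ -\frac{1}{n} \log p_{\hat S_1^n} \le \frac{2}{n} \sum_{i=1}^n \log(|Z_i|/2\varepsilon + 2) + c
   \to 2\, \mathbb{E}[\log(|Z_1|/2\varepsilon + 2)] + c, \]

where the convergence follows due to the ergodicity of $(Z_n)$. Therefore the codebooks

\[ C_n := \Big\{ \hat s_1^n \in \mathbb{R}^n : -\frac{1}{n} \log p_{\hat s_1^n} \le 2\, \mathbb{E}[\log(|Z_1|/2\varepsilon + 2)] + 2c \Big\} \]

satisfy the required assertion. □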
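
The recursive reconstruction in this proof is easy to simulate. The sketch below is an illustration under stated assumptions: it tracks a given sequence of partial sums with increments in $2\varepsilon\mathbb{Z}$ and evaluates the code length of the increments under the weights $p_k = e^{-c}(|k|+1)^{-2}$. The normalizing constant is taken as $c = \log(\pi^2/3 - 1)$, which follows from $\sum_{k \in \mathbb{Z}} (|k|+1)^{-2} = \pi^2/3 - 1$; the function names and the sample data are hypothetical.

```python
import math

# c is chosen so that e^{-c} / (|k| + 1)^2 sums to one over all integers k.
C_NORMALIZER = math.log(math.pi**2 / 3.0 - 1.0)

def track(sums, eps):
    """Greedy reconstruction of partial sums with increments in 2 * eps * Z."""
    recon, steps = [], []
    current = 0.0
    for s in sums:
        k = round((s - current) / (2.0 * eps))   # index of the nearest admissible increment
        current += 2.0 * eps * k
        recon.append(current)
        steps.append(k)
    return recon, steps

def code_length(steps):
    """-log of the product weight: sum over steps of 2 log(|k| + 1) + c."""
    return sum(2.0 * math.log(abs(k) + 1) + C_NORMALIZER for k in steps)

sums = [0.3, -0.45, 1.2, 1.25, 0.1]
eps = 0.1
recon, steps = track(sums, eps)
max_error = max(abs(s - r) for s, r in zip(sums, recon))
```

Each reconstruction stays within $\varepsilon$ of the corresponding partial sum, while the code length depends only on the sizes of the increments, which is why the ergodic theorem controls it in the proof.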

We now use the SCT combined with the previous lemma to construct codebooks that
guarantee almost optimal reconstructions with a high probability.

Lemma 6.3. For any $\varepsilon > 0$ there exist codebooks $C_r$, $r \ge 0$, of size $e^r$ such
that

\[ \lim_{r\to\infty} \mathbb{P}\big( d_p(X, C_r) \le (1+\varepsilon)\, \kappa_p\, r^{-H} \big) = 1. \]

Proof. Let e > be arbitrary and c be as in Lemma 16.21 We fix ro > ( gn^r ) 1 such 
that 
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for all r >tq. Then choose r\ > ro with 

D(n\p) < {l + e)K p r± H . 
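We decompose $X$ into the two processes

$$X_t^{(1)} = X_t - X_{\lfloor t\rfloor} \quad\text{and}\quad X_t^{(2)} = X_{\lfloor t\rfloor}.$$

Due to the SCT there exist codebooks $C_n^{(1)} \subset L_p[0,n]$ of size $\exp\{(1+\varepsilon)nr_1\}$ satisfying

$$\lim_{n\to\infty} \mathbb{P}\big(d_{n,p}(X^{(1)}, C_n^{(1)})^p \le (1+2\varepsilon)^p \kappa_p^p\, r_1^{-pH}\big) = 1.$$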

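We apply Lemma 6.2 to the increments $Z_i = X_i - X_{i-1}$ with $\varepsilon$ replaced by $\varepsilon\kappa_p r_1^{-H}$. Note that, by Jensen's inequality,

$$\mathbb{E}\log\Big(\frac{|X_1|}{2\varepsilon\kappa_p r_1^{-H}} + 2\Big) + c \le \log\Big(\frac{\mathbb{E}|X_1|}{2\varepsilon\kappa_p r_1^{-H}} + 2\Big) + c.$$

Since $r_1 \ge r_0 > (4\varepsilon\kappa_p/\mathbb{E}|X_1|)^{1/H}$, it follows that $\frac{\mathbb{E}|X_1|}{2\varepsilon\kappa_p r_1^{-H}} = \frac{r_1^{H}\,\mathbb{E}|X_1|}{2\varepsilon\kappa_p} \ge 2$, so that

$$\log\Big(\frac{\mathbb{E}|X_1|}{2\varepsilon\kappa_p r_1^{-H}} + 2\Big) + c \le \log\Big(\frac{\mathbb{E}|X_1|}{\varepsilon\kappa_p r_1^{-H}}\Big) + c = -\log(\varepsilon\kappa_p r_1^{-H}) + c + \log \mathbb{E}|X_1| \le \varepsilon r_1,$$

due to (25). Hence, there exist codebooks $C_n^{(2)} \subset L_p[0,n]$ of size $\exp\{\varepsilon n r_1\}$ with

$$\lim_{n\to\infty} \mathbb{P}\big(d_{n,p}(X^{(2)}, C_n^{(2)}) \le \varepsilon\kappa_p r_1^{-H}\big) = 1.$$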



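Let now $C_n := C_n^{(1)} + C_n^{(2)}$ denote the Minkowski sum of the sets $C_n^{(1)}$ and $C_n^{(2)}$. Then
$|C_n| \le \exp\{(1+2\varepsilon)nr_1\}$, and one has

$$\mathbb{P}\big(d_{n,p}(X, C_n) \le (1+3\varepsilon)\kappa_p r_1^{-H}\big) \ge \mathbb{P}\big(d_{n,p}(X^{(1)}, C_n^{(1)}) \le (1+2\varepsilon)\kappa_p r_1^{-H} \text{ and } d_{n,p}(X^{(2)}, C_n^{(2)}) \le \varepsilon\kappa_p r_1^{-H}\big) \to 1.$$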
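The Minkowski-sum step rests on two elementary facts: the sum codebook has at most $|C^{(1)}|\cdot|C^{(2)}|$ elements, and by the triangle inequality the distance of $x^{(1)} + x^{(2)}$ to the sum codebook is at most the sum of the two individual distances. The sketch below illustrates this in a finite-dimensional toy setting; it assumes vectors in $\mathbb{R}^n$ with a normalised $\ell_p$ distance in place of $d_{n,p}$, and all function names are hypothetical.

```python
from itertools import product


def lp_dist(x, y, p):
    """Normalised l_p distance (assumed discrete analogue of d_{n,p})."""
    n = len(x)
    return (sum(abs(a - b) ** p for a, b in zip(x, y)) / n) ** (1.0 / p)


def dist_to_codebook(x, codebook, p):
    """Distance from x to the nearest codebook element."""
    return min(lp_dist(x, c, p) for c in codebook)


def minkowski_sum(c1, c2):
    """All coordinate-wise sums u + v with u in c1 and v in c2."""
    return {tuple(a + b for a, b in zip(u, v)) for u, v in product(c1, c2)}
```

For example, with `c1 = [(0.0, 0.0), (1.0, 1.0)]` and `c2 = [(0.0, 1.0), (2.0, 0.0)]`, the sum codebook has at most four elements, and any `x = x1 + x2` is approximated at least as well as the two parts combined.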

Consider the isometric isomorphism

$$\beta_n : L_p[0,1] \to (L_p[0,n], d_{n,p}), \quad f \mapsto f(\cdot/n),$$

and the codebooks $\tilde C_n \subset L_p[0,1]$ given by

$$\tilde C_n := n^{-H}\beta_n^{-1}(C_n).$$

Then $\tilde X^{(n)} = n^{-H}\beta_n^{-1}(X)$ is a fractional Brownian motion and one has

$$d_p(\tilde X^{(n)}, \tilde C_n) = d_{n,p}(\beta_n(\tilde X^{(n)}), \beta_n(\tilde C_n)) = n^{-H} d_{n,p}(X, C_n).$$

Hence, the codebooks $\tilde C_n$ are of size $\exp\{(1+2\varepsilon)nr_1\}$ and satisfy

$$\mathbb{P}\big(d_p(X, \tilde C_n) \le (1+3\varepsilon)\kappa_p (nr_1)^{-H}\big) = \mathbb{P}\big(d_{n,p}(X, C_n) \le (1+3\varepsilon)\kappa_p r_1^{-H}\big) \to 1$$

as $n \to \infty$. Now the general statement follows by an interpolation argument similar to
that used at the end of the proof of Theorem 3.1. □

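Proof of Theorem 6.1. Let $q \ge 1$ be arbitrary, let $C_r^{(1)}$ be as in the above lemma for
some fixed $\varepsilon > 0$. Moreover, we let $C_r^{(2)}$ denote codebooks of size $e^r$ with

$$\mathbb{E}[d_p(X, C_r^{(2)})^{2q}]^{1/(2q)} \approx \frac{1}{r^H}.$$

Then the codebooks $C_r := C_r^{(1)} \cup C_r^{(2)}$ contain at most $2e^r$ elements and satisfy, in analogy
to the proof of Theorem 4.1 (see the corresponding estimate there),

$$\mathbb{E}[d_p(X, C_r)^q]^{1/q} \lesssim (1+\varepsilon)\,\kappa_p \frac{1}{r^H}.$$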

Since $\varepsilon > 0$ is arbitrary, it follows that

$$D^{(q)}(r|q) \lesssim \kappa_p \frac{1}{r^H}.$$

For $q \ge p$ the quantization error is greater than the distortion rate function $D(r|p)$, so
that the former inequality extends to

$$\lim_{r\to\infty} r^H D^{(q)}(r|q) = \kappa_p.$$

In particular, we obtain the asymptotic equivalence of all moments $q_1, q_2$ greater or equal
to $p$. Next, an application of Theorem 5.2 with $d(f,g) = d_p(f,g)^q$ implies that for any
$q > 0$,

$$D^{(q)}(r|q) \gtrsim \kappa_p \frac{1}{r^H},$$

which establishes the assertion. □

Appendix 

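Lemma A.1. For $r \ge 0$, let $A_r$ denote $[0,\infty)$-valued r.v.'s. If one has, for $0 < q_1 < q_2$
and some function $f : [0,\infty) \to \mathbb{R}_+$,

$$\mathbb{E}[A_r^{q_1}]^{1/q_1} \sim \mathbb{E}[A_r^{q_2}]^{1/q_2} \sim f(r), \qquad (26)$$

then

$$A_r \sim f(r), \text{ in probability.}$$

Proof. Consider

$$\tilde A_r := A_r^{q_1}/\mathbb{E}[A_r^{q_1}],$$

and $\tilde q_2 = q_2/q_1$. Then (26) implies that

$$\mathbb{E}[\tilde A_r^{\tilde q_2}]^{1/\tilde q_2} \sim \mathbb{E}[\tilde A_r] = 1.$$

Denoting $\Delta\tilde A_r := \tilde A_r - 1$ and $g(x) := x^{\tilde q_2}$, we obtain

$$\mathbb{E}[\tilde A_r^{\tilde q_2}] = \mathbb{E}[1 + \Delta\tilde A_r\, g'(1) + g(1 + \Delta\tilde A_r) - (1 + \Delta\tilde A_r\, g'(1))] = 1 + \mathbb{E}[g(\tilde A_r) - (1 + \Delta\tilde A_r\, g'(1))],$$

where the second equality uses $\mathbb{E}[\Delta\tilde A_r] = 0$. Due to the strict convexity of $g$, for arbitrary $\varepsilon > 0$ there exists $\delta > 0$ such that

$$g(x+1) \ge 1 + x\,g'(1) + \delta, \quad\text{for } x \in [-1, -\varepsilon] \cup [\varepsilon, \infty).$$

Consequently,

$$\mathbb{E}[\tilde A_r^{\tilde q_2}] \ge 1 + \delta\,\mathbb{P}(|\Delta\tilde A_r| \ge \varepsilon).$$

Since $\lim_{r\to\infty} \mathbb{E}[\tilde A_r^{\tilde q_2}] = 1$, it follows that $\lim_{r\to\infty} \mathbb{P}(|\Delta\tilde A_r| \ge \varepsilon) = 0$. Hence,

$$A_r = \mathbb{E}[A_r^{q_1}]^{1/q_1}\, \tilde A_r^{1/q_1} \sim \mathbb{E}[A_r^{q_1}]^{1/q_1} \sim f(r), \text{ in probability.}$$

□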



Lemma A.2. Let $q \ge 1$. There exists a constant $c = c(q) < \infty$ such that for all $[1,\infty)$-valued
r.v.'s $Z$ one has

$$\mathbb{E}[(\log Z)^q]^{1/q} \le c\,[1 + \log \mathbb{E}[Z]].$$

Proof. Using elementary analysis, there exists a positive constant $c_1 = c_1(q) < \infty$ such
that $\psi(x) := (\log x)^q + c_1 \log x$, $x \in [1,\infty)$, is concave. For any $[1,\infty)$-valued r.v. $Z$,
Jensen's inequality then yields

$$\mathbb{E}[(\log Z)^q]^{1/q} \le \mathbb{E}[\psi(Z)]^{1/q} \le \psi(\mathbb{E}[Z])^{1/q} \le \log \mathbb{E}[Z] + c_1^{1/q}(\log \mathbb{E}[Z])^{1/q} \le c\,[1 + \log \mathbb{E}[Z]],$$

where $c = c(q) < \infty$ is an appropriate universal constant. □

Lemma A.3. Let $f : [0,\infty) \to \mathbb{R}_+$ be a decreasing, convex function satisfying $\lim_{r\to\infty} f(r) = 0$
and

$$\limsup_{r\to\infty} \frac{-r\,\frac{\partial}{\partial r} f(r)}{f(r)} < \infty, \qquad (27)$$

and $\mathcal{F}$ be a family of $[0,\infty]^2$-valued random variables for which

$$\lim_{r\to\infty} \sup_{(A,B)\in\mathcal{F}} \mathbb{P}(A \le f(r), B \le r) = 0. \qquad (28)$$

Then the sets of random variables $\mathcal{F}_r$ defined for $r \ge 0$ through

$$\mathcal{F}_r := \{A : (A,B) \in \mathcal{F},\ \mathbb{E}B \le r\}$$

satisfy

$$\inf_{A\in\mathcal{F}_r} \mathbb{E}A \gtrsim f(r)$$

as $r \to \infty$.
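To see what the hypotheses on $f$ ask for, one can check them numerically for a concrete rate function. The sketch below is an illustration only: it assumes the hypothetical choice $f(r) = \kappa(1+r)^{-H}$ with $\kappa = 1$ and $H = 1/2$ (finite at $r = 0$, decreasing, convex, tending to $0$), for which the ratio in (27) equals $Hr/(1+r)$ and tends to $H$. It also checks the tangent-line property of convex functions, $f(R) + \lambda R \le f(r) + \lambda r$ with $\lambda = -f'(R)$, on a grid.

```python
KAPPA = 1.0   # assumed constant, for illustration only
HURST = 0.5   # assumed exponent H, for illustration only


def f(r):
    """Hypothetical rate function f(r) = kappa * (1 + r)^(-H)."""
    return KAPPA * (1.0 + r) ** (-HURST)


def neg_derivative(r):
    """-f'(r) for the function above, computed in closed form."""
    return HURST * KAPPA * (1.0 + r) ** (-HURST - 1.0)


def elasticity(r):
    """The ratio -r f'(r) / f(r) appearing in condition (27)."""
    return r * neg_derivative(r) / f(r)


def lagrangian_gap(big_r, grid):
    """min over the grid of f(r) + lam*r, minus f(R) + lam*R, lam = -f'(R)."""
    lam = neg_derivative(big_r)
    return min(f(r) + lam * r for r in grid) - (f(big_r) + lam * big_r)
```

Under these assumptions `elasticity(r)` stays below `HURST` and approaches it for large `r`, and `lagrangian_gap` is nonnegative up to rounding, reflecting that the tangent at $R$ lies below the graph of a convex function.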



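Proof. Fix $R > 0$, positive integers $I$ and $N$, and define $\lambda := -\frac{\partial}{\partial R} f(R)$,

$$r_i := \frac{N+i}{N} R, \quad i = -N, -N+1, \dots.$$

For $(A,B) \in \mathcal{F}$ with $\mathbb{E}B \le R$, we define the event

$$\mathcal{T}_{A,B} := \{\nexists\, i \in \{-N+1, \dots, I\} \text{ such that } A \le f(r_i) \text{ and } B \le r_i\}.$$

Then we have

$$\begin{aligned}
\mathbb{E}[A + \lambda B] &\ge \sum_{i=-N}^{I-1} \mathbb{E}\big[1_{\mathcal{T}_{A,B}} 1_{[r_i, r_{i+1})}(B)\,(A + \lambda r_i)\big] \\
&\ge \sum_{i=-N}^{I-1} \mathbb{E}\big[1_{\mathcal{T}_{A,B}} 1_{[r_i, r_{i+1})}(B)\,(f(r_{i+1}) + \lambda r_i)\big] \\
&= \sum_{i=-N}^{I-1} \mathbb{E}\big[1_{\mathcal{T}_{A,B}} 1_{[r_i, r_{i+1})}(B)\,(f(r_{i+1}) + \lambda r_{i+1} - \lambda R/N)\big] \\
&\ge \sum_{i=-N}^{I-1} \mathbb{E}\big[1_{\mathcal{T}_{A,B}} 1_{[r_i, r_{i+1})}(B)\,(f(R) + \lambda R - \lambda R/N)\big],
\end{aligned}$$

where the last inequality follows from the fact that

$$f(R) + \lambda R = \inf_{r \ge 0}\,[f(r) + \lambda r]$$

by the definition of $\lambda$ and the convexity of $f$. Now, fix $\varepsilon > 0$ and pick $N \ge 1/\varepsilon$, $I \ge 2N/\varepsilon$
and $R_0$ so large that

$$\mathbb{P}(\mathcal{T}_{A,B}) \ge 1 - \frac{\varepsilon}{2} \quad\text{for all } R \ge R_0 \text{ and all } (A,B) \in \mathcal{F}.$$

Using Chebychev's inequality, we then obtain for $R \ge R_0$,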



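$$\begin{aligned}
\mathbb{E}[A + \lambda B] &\ge (1-\varepsilon)(f(R) + \lambda R)\Big(1 - \mathbb{P}(\mathcal{T}_{A,B}^c) - \mathbb{P}\big(B \ge r_I\big)\Big) \\
&\ge (1-\varepsilon)(f(R) + \lambda R)\Big(1 - \frac{\varepsilon}{2} - \frac{\varepsilon}{2}\Big).
\end{aligned}$$

Hence,

$$\lambda R + \mathbb{E}A \ge (1-\varepsilon)^2 (f(R) + \lambda R)$$

and therefore

$$\mathbb{E}A \ge (1-\varepsilon)^2 f(R) + \lambda R\,\big((1-\varepsilon)^2 - 1\big).$$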



Using the definition of $\lambda$ and (27), as well as the fact that $\varepsilon > 0$ is arbitrary, the conclusion